Audio head pose estimation using the direct to reverberant speech ratio



Similar resources

PSD estimation in Beamspace for Estimating Direct-to-Reverberant Ratio from A Reverberant Speech Signal

A method for estimating the direct-to-reverberant ratio (DRR) using a microphone array is proposed. The proposed method estimates the power spectral density (PSD) of the direct sound and the reverberation using the PSD-estimation-in-beamspace algorithm with a microphone array, then calculates the DRR of the observed signal. The speech corpus of the ACE (Acoustic Characterisation of Environments) C...


Direct-to-Reverberant Ratio Estimation on the ACE Corpus Using a Two-channel Beamformer

Direct-to-Reverberant Ratio (DRR) is an important measure for characterizing the properties of a room. The recently proposed DRR Estimation using a Null-Steered Beamformer (DENBE) algorithm was originally tested on simulated data where noise was artificially added to the speech after convolution with impulse responses simulated using the image-source method. This paper evaluates the performance...


Estimation of the direct-to-reverberant Energy Ratio using a spherical microphone array

This paper proposes a practical approach to estimate the direct-to-reverberant energy ratio (DRR) using a spherical microphone array without having knowledge of the source signal. We base our estimation on a theoretical relationship between the DRR and the coherence estimation function between coincident pressure and particle velocity. We discuss the proposed method’s ability to estimate the DRR...


Sampling techniques for audio-visual tracking and head pose estimation

Analyzing people's behavior in smart environments using multimodal sensors requires answering a set of typical questions: who are the people, where are they, what activities are they doing, when, with whom are they interacting, and how. In this view, locating people or their faces and characterizing them (e.g. extracting their body or head orientation) makes it possible to address the first two questions (w...


Ideal Ratio Mask Estimation Using Deep Neural Networks for Monaural Speech Segregation in Noisy Reverberant Conditions

Monaural speech segregation is an important problem in robust speech processing and has been formulated as a supervised learning problem. In supervised learning methods, the ideal binary mask (IBM) is usually used as the target because of its simplicity and large speech intelligibility gains. Recently, the ideal ratio mask (IRM) has been found to improve the speech quality over the IBM. However...



Journal

Journal title: Speech Communication

Year: 2016

ISSN: 0167-6393

DOI: 10.1016/j.specom.2016.09.005